1. A method for removing noise from a digital representation of data images and noise images produced by digital scanning of a document, comprising:

(a) performing an object grabbing operation on the digital representation to obtain all objects of the document;

(b) identifying objects that represent essential information of the document and marking them as data objects; and

(c) reconstructing a digital representation of a reduced noise version of the document consisting of all of the marked data objects.
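
The three steps of Claim 1 can be sketched as follows. This is only an illustrative reading, not the claimed implementation: it assumes a binary image stored as a list of rows, treats "object grabbing" as 8-connected component labeling, and uses a hypothetical caller-supplied predicate `is_data` (here, a simple pixel-count test) to decide which objects are data objects.

```python
from collections import deque

def grab_objects(image):
    """Group adjoining 1-pixels (8-connectivity) into objects.

    Each object is returned as a list of (row, col) pixel positions.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            objects.append(pixels)
    return objects

def remove_noise(image, is_data):
    """Keep only objects for which is_data(obj) is true (hypothetical criterion)."""
    data_objects = [obj for obj in grab_objects(image) if is_data(obj)]
    clean = [[0] * len(image[0]) for _ in image]
    for obj in data_objects:
        for y, x in obj:
            clean[y][x] = 1
    return clean

scan = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
# Assumed rule for this example only: objects of 3 or more pixels are data.
cleaned = remove_noise(scan, is_data=lambda obj: len(obj) >= 3)
```

In this example the two isolated pixels are dropped and the 2x2 block survives. The size threshold stands in for whatever identification rule a real system would apply in step (b).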

2. A method for removing noise from a digital representation of essential images and noise images produced by digital scanning of a document, comprising:

(a) performing an object grabbing operation on the digital representation to identify adjoining pixels which form objects;

(b) operating a processor so as to identify a data area portion of the digital representation that includes the objects which constitute the essential images;

(c) operating the processor to perform a logical ANDing operation between the data area portion and the digital representation to provide a digital representation of the essential images without the noise images; and

(d) operating the processor to eliminate the noise images located outside of 
the data areas from the digital representation to provide another digital representation of the data 
images. 
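
The logical ANDing step can be illustrated with a small sketch. It assumes, purely for illustration, that the data area portion is expressed as a set of inclusive bounding rectangles `(top, left, bottom, right)`; the names `data_area_mask` and `and_images` are hypothetical and do not come from the claims.

```python
def data_area_mask(shape, boxes):
    """Build a binary mask that is 1 inside each data-area rectangle.

    boxes holds (top, left, bottom, right) tuples with inclusive bounds;
    this rectangle representation is an assumption of the sketch.
    """
    rows, cols = shape
    mask = [[0] * cols for _ in range(rows)]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                mask[y][x] = 1
    return mask

def and_images(a, b):
    """Pixel-wise logical AND of two binary images of equal size."""
    return [[pa & pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

scan = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
mask = data_area_mask((3, 4), [(0, 0, 1, 1)])
essential = and_images(mask, scan)
```

Pixels outside the data area are forced to 0 by the AND, so noise located outside the data areas disappears while everything inside is preserved unchanged.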




3. A method for removing noise from a digital representation of essential images and noise images produced by digital scanning of a document, comprising:

(a) operating a processing system to perform an object grabbing operation on the digital representation to identify adjoining pixels which form objects;

(b) operating the processing system to identify data objects of the digital representation that constitute essential images;

(c) operating the processing system to mark the identified objects as data; and

(d) operating the processing system to eliminate all objects not marked as data from the digital representation to provide a reconstructed digital representation of the essential images without the noise images.



4. A method for producing a cleaned-up digital image of a document including essential data images and undesired noise images, comprising:

(a) digitally scanning the document to produce a first digital representation of the data images and the noise images;

(b) performing a first object grabbing operation on the first digital representation to identify all object images thereof;

(c) determining a skew angle of a straight line having a predetermined relationship to some objects representative of the essential data images and de-skewing the document by rotating the first digital representation by an amount equal to the magnitude of the skew angle to provide a de-skewed first digital representation;



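(d) performing a second object grabbing operation on the de-skewed first digital representation to create an object list of all object images of the de-skewed first digital representation;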



(e) identifying a portion of the de-skewed first digital representation corresponding to a picture region of the document;

(f) producing a reduced-resolution representation of the de-skewed first digital representation and performing a second object grabbing operation on the reduced-resolution representation;

(g) identifying objects of the reduced-resolution representation representing essential data areas of the document; and

(h) constructing the cleaned-up digital image of the document by performing a logical ANDing operation between the picture region and the data areas with the de-skewed first digital representation to eliminate all objects outside of the picture region and the data areas to provide the cleaned-up digital image.



5. A method for producing a cleaned-up digital image of a document including essential data images and undesired noise images, comprising:

(a) digitally scanning the document to produce a first digital representation of the data images and the noise images;

(b) performing a first object grabbing operation on the first digital representation to identify all object images thereof;

(c) determining a skew angle of a straight line having a predetermined relationship to some objects representative of the essential data images and de-skewing the document by rotating the first digital representation by an amount equal to the magnitude of the skew angle to provide a de-skewed first digital representation;

(d) performing a second object grabbing operation on the de-skewed first digital representation to create an object list of all object images of the de-skewed first digital representation;

(e) identifying a portion of the de-skewed first digital representation corresponding to a picture region of the document;

(f) identifying objects representing essential data images of the document and marking the identified objects as data objects; and

(g) constructing the cleaned-up digital image of the document by

i. combining the objects in the picture region and the marked data objects, and

ii. eliminating all objects not marked as data objects to provide a reconstructed digital representation of the essential images without the noise images.

6. The method of Claim 5 including performing the first object grabbing operation by obtaining serial runlength data from the first digital representation including slices that each include the length and ending pixel number of a string of connected pixels having a "1" value, operating line-by-line on the runlength data by means of a decision tree classifier that creates software objects including a first linked list of a number of further linked lists each of which contains all of the slices of an object image, entering the slices of the object image into a software frame in the same order in which the slices are scanned, determining if the object image can be represented as a trapezoid or as an irregular blob containing all of its slices, fitting the data in the software frame representing the object image into a decision tree classifier, and operating the classifier to recognize and assign identifiers to divergences, convergences, and open ends of the object image and create a new linked list of linked lists representing the object image in the form of blob records, trapezoid records, divergence records, and convergence records which then can be conveniently used in subsequent vectorization operations without the need to scan and recognize data representing the object image.
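
The runlength "slices" named in Claim 6 can be illustrated for a single scan line. The sketch below covers only that first part of the claim (length and ending pixel number of each string of connected "1" pixels); the decision tree classifier and the linked-list records are not modeled. The function name `row_slices` and the tuple layout are assumptions made for the example.

```python
def row_slices(row):
    """Return (length, ending_pixel_number) for each run of 1-pixels in a scan line."""
    slices = []
    run = 0
    for x, pixel in enumerate(row):
        if pixel == 1:
            run += 1
        elif run:
            # The run ended on the previous pixel.
            slices.append((run, x - 1))
            run = 0
    if run:
        # A run that reaches the right edge ends on the last pixel.
        slices.append((run, len(row) - 1))
    return slices

line = [0, 1, 1, 1, 0, 0, 1, 1]
slices = row_slices(line)
```

Storing the length together with the ending pixel number is enough to recover the starting pixel as `end - length + 1`, which is what lets slices on successive lines be tested for overlap when grouping them into objects.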
7. The method of Claim 6 including performing the second object grabbing operation by performing steps which are essentially similar to the steps of the first object grabbing operation.



8. The method of Claim 5 wherein in step (c) the line has the predetermined relationship to a plurality of text objects in a row of text objects.




9. The method of Claim 8 wherein some of the text objects are centered about the line.



10. The method of Claim 8 including building the row of text objects by successively adding any closest nearby text object to either end of a row initially including a first text object.
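
One way to picture Claims 8 through 10 is the sketch below. It assumes each text object is reduced to its center point `(x, y)`, grows a row by repeatedly attaching the closest remaining center to whichever end of the row it is nearer, and then fits a least-squares line through the centers to obtain a skew angle. The `max_gap` stopping distance and the least-squares fit are assumptions of this sketch; the claims do not specify them.

```python
import math

def build_row(centers, start, max_gap):
    """Grow a row of text-object centers from centers[start].

    At each step the remaining center closest to either end of the row is
    attached to that end; growth stops when nothing lies within max_gap.
    """
    row = [centers[start]]
    remaining = [c for i, c in enumerate(centers) if i != start]
    while remaining:
        best = min(remaining,
                   key=lambda c: min(math.dist(c, row[0]), math.dist(c, row[-1])))
        d_first, d_last = math.dist(best, row[0]), math.dist(best, row[-1])
        if min(d_first, d_last) > max_gap:
            break
        if d_first < d_last:
            row.insert(0, best)
        else:
            row.append(best)
        remaining.remove(best)
    return row

def skew_angle_degrees(row):
    """Angle of the least-squares line through the row centers, in degrees.

    Positive values mean y grows with x (in image coordinates, y points down).
    """
    n = len(row)
    mean_x = sum(x for x, _ in row) / n
    mean_y = sum(y for _, y in row) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in row)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in row)
    return math.degrees(math.atan2(sxy, sxx))

centers = [(0, 0), (10, 1), (20, 2), (30, 3), (15, 50)]
row = build_row(centers, start=1, max_gap=15)
angle = skew_angle_degrees(row)
```

Here the four collinear centers form the row and the distant center at `(15, 50)` is left out. Rotating the image by the negative of `angle` would then level the row.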



11. The method of Claim 6 wherein in step (c) the line has the predetermined relationship to a plurality of geometric objects identified by vectorizing objects larger than a predetermined size, determining near-horizontal lines and near-vertical lines of the vectorized objects, and selecting a value of the skew angle which minimizes the mean square deviation of the near-horizontal and near-vertical lines from orthogonality.
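
The selection of a skew angle that minimizes mean square deviation can be sketched under one simple interpretation: each near-horizontal line contributes its angular offset from 0 degrees, each near-vertical line contributes its offset from 90 degrees, and the single rotation that minimizes the sum of squared residual offsets is their arithmetic mean. The `tolerance` parameter and the function name are hypothetical.

```python
def skew_from_lines(angles_deg, tolerance=10.0):
    """Estimate skew from the orientations of vectorized lines.

    angles_deg are line orientations in degrees. Lines within `tolerance`
    of horizontal or vertical are kept; others are ignored. The mean of
    the kept offsets minimizes the sum of squared deviations.
    """
    deviations = []
    for angle in angles_deg:
        a = angle % 180.0  # an undirected line has an orientation in [0, 180)
        if a <= tolerance:
            deviations.append(a)            # slightly above horizontal
        elif a >= 180.0 - tolerance:
            deviations.append(a - 180.0)    # slightly below horizontal
        elif abs(a - 90.0) <= tolerance:
            deviations.append(a - 90.0)     # near vertical
    if not deviations:
        return 0.0
    return sum(deviations) / len(deviations)

estimate = skew_from_lines([2.0, 3.0, 92.0, 93.0, 45.0])
```

The 45 degree line is treated as neither near-horizontal nor near-vertical and does not influence the estimate. A length-weighted mean would be a natural refinement, but nothing in the claim requires it.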

12. The method of Claim 5 including, after step (b), classifying at least a portion of the document as a type including mainly text objects or mainly geometric objects by producing a first reduced-resolution representation of the first digital representation and performing another object grabbing operation on the first reduced-resolution representation to identify objects of the first reduced-resolution representation, determining the numbers of text-character-shaped rectangular objects and geometric objects thereof, respectively, and classifying the document as text type if the number of text-character-shaped objects is greater than the number of geometric objects, and otherwise classifying the document as geometric type.



13. The method of Claim 5 wherein step (f) includes forming a row of text including text objects near to each other and having heights within a predetermined range, and marking all object images produced according to step (b) within the row as data objects.
14. The method of Claim [illegible] including identifying the object images which are geometric objects and marking them as data objects.

15. The method of Claim 14 wherein the identifying of geometric objects includes identifying only objects which have sufficiently high density and a sufficiently large aspect ratio as geometric objects.
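
The density and aspect ratio test of Claim 15 might look like the sketch below. It assumes density means the fraction of an object's bounding box that is filled with object pixels, and aspect ratio means the longer bounding-box side divided by the shorter. Both definitions and both threshold values are assumptions chosen for illustration.

```python
def bounding_box(pixels):
    """Inclusive bounding box (top, left, bottom, right) of (row, col) pixels."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)

def is_geometric(pixels, min_density=0.8, min_aspect=5.0):
    """True if the object is dense and elongated (thresholds are hypothetical)."""
    top, left, bottom, right = bounding_box(pixels)
    height = bottom - top + 1
    width = right - left + 1
    density = len(pixels) / (height * width)
    aspect = max(height, width) / min(height, width)
    return density >= min_density and aspect >= min_aspect

horizontal_stroke = [(0, x) for x in range(10)]   # 1 x 10 solid line
small_square = [(y, x) for y in range(2) for x in range(2)]
stroke_is_geometric = is_geometric(horizontal_stroke)
square_is_geometric = is_geometric(small_square)
```

A solid straight stroke passes both tests, while a compact square fails on aspect ratio. A sparse, elongated object such as a diagonal hairline would fail on density under these definitions.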

16. The method of Claim 14 wherein the identifying of geometric objects includes identifying whole geometric objects by getting a next object image having a size greater than a predetermined text size, and, if the next object image has a density lower than a predetermined density, performing a neural network operation to determine if the next object image is a whole geometry object, and, if the neural network operation determines that the next object image is a whole geometry object, marking the next object image as a data object.

17. The method of Claim 14 wherein the identifying of geometric objects includes identifying broken geometry objects by performing a quad tree operation on all object images not previously identified as either text objects or geometric objects, [illegible] object, [illegible] any nearby non-marked objects of similar shape, attempting to extend a pattern of similar non-marked objects in opposite directions from the non-marked object, and marking all objects included in the pattern as data objects.



18. The method of Claim 17 including computing a pattern confidence level, and marking the objects included in the pattern only if the confidence level exceeds a predetermined level.



19. The method of Claim 14 including identifying any object images which constitute dashed lines or dotted lines and marking such identified object images as data objects, by creating a grid of wide, short rectangles or a grid of tall, narrow rectangles covering at least a portion of the document, summing the areas of all dash-sized or dot-sized objects into appropriate rectangles, eliminating objects in the appropriate rectangles having sufficiently small area sums, obtaining a histogram of all objects in the appropriate rectangles by area and x-coordinate or y-coordinate, and marking each object having a sufficiently large histogram peak and located between predetermined coordinate bounds as a data object.
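
A much simplified sketch of the area-summing idea in Claim 19 follows, for horizontal dashed lines only. It assumes each object is summarized by its pixel area and the integer row `y` of its center, replaces the grid of wide, short rectangles with full-width horizontal bands, and omits the histogram and coordinate-bound steps. All parameter names and values are hypothetical.

```python
def mark_dashed_lines(objects, band_height, max_dash_area, min_band_area):
    """Return indices of dash-sized objects lying in bands with a large area sum.

    objects: list of dicts with integer keys 'area' and 'y'.
    A band is a horizontal strip band_height pixels tall.
    """
    band_sums = {}
    for obj in objects:
        if obj["area"] <= max_dash_area:
            band = obj["y"] // band_height
            band_sums[band] = band_sums.get(band, 0) + obj["area"]

    marked = []
    for index, obj in enumerate(objects):
        if obj["area"] > max_dash_area:
            continue  # too large to be a dash or dot
        if band_sums[obj["y"] // band_height] >= min_band_area:
            marked.append(index)
    return marked

dashes = [{"area": 6, "y": 10} for _ in range(5)]   # five dashes on one row
speck = {"area": 4, "y": 50}                        # isolated noise speck
large = {"area": 500, "y": 12}                      # not dash-sized
marked = mark_dashed_lines(dashes + [speck, large],
                           band_height=8, max_dash_area=10, min_band_area=20)
```

The isolated speck is dash-sized, but its band accumulates too little area, so it stays unmarked. Many small objects lined up in one band reinforce each other and are kept.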
20. The method of Claim [illegible] wherein the producing of the reduced-resolution representation of the de-skewed first digital representation includes representing each tile of four adjacent pixels of the first digital representation as a single pixel and setting the single pixel to a "1" state if any of the four adjacent pixels of the tile is at a "1" state and otherwise setting the single pixel to a "0" state.



21. The method of Claim 20 including producing a first reduced-resolution representation of the first digital representation by representing each tile of four adjacent pixels of the reduced-resolution representation as a single pixel and setting that single pixel to a "1" state if any of the four adjacent pixels of that tile is at a "1" state and otherwise setting that single pixel to a "0" state.
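
The tile reduction recited in Claims 20 and 21 amounts to a logical OR over each tile of four adjacent pixels. The sketch below assumes the tiles are non-overlapping 2x2 blocks and that partial tiles at odd-sized edges are reduced the same way; the claims do not address the edge case.

```python
def reduce_resolution(image):
    """Halve resolution: each 2x2 tile becomes 1 if any of its pixels is 1, else 0."""
    rows, cols = len(image), len(image[0])
    reduced = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            tile = [image[y][x]
                    for y in range(r, min(r + 2, rows))
                    for x in range(c, min(c + 2, cols))]
            out_row.append(1 if any(tile) else 0)
        reduced.append(out_row)
    return reduced

image = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
half = reduce_resolution(image)
quarter = reduce_resolution(half)   # applying the same rule a second time
```

Because a single set pixel is enough to set the output pixel, thin strokes survive the reduction, and nearby characters tend to merge into larger objects.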



22. A method for producing a cleaned-up digital image of a document including 
essential data images and undesired noise images, comprising: 



(a) digitally scanning the document to produce a first digital representation of the data images and the noise images;

(b) operating a processor to perform a first object grabbing operation on the first digital representation to identify all object images thereof;

(c) operating the processor to determine a skew angle of a straight line having a predetermined relationship to at least some objects representative of essential data and to de-skew the document by rotating the first digital representation by an amount equal to the magnitude of the skew angle to provide a de-skewed first digital representation;

(d) operating the processor to perform a second object grabbing operation on the de-skewed first digital representation to create an object list of all object images of the de-skewed first digital representation;

(e) operating the processor so as to identify a portion of the de-skewed first digital representation corresponding to a picture region of the document;

(f) operating the processor to produce a reduced-resolution representation of the de-skewed first digital representation and to perform a second object-grabbing operation on the reduced-resolution representation;

(g) operating the processor to identify objects of the reduced-resolution representation representing essential data areas of the document; and

(h) constructing the cleaned-up digital image of the document by operating the processor to perform a logical ANDing operation between the picture region and the data areas with the de-skewed first digital representation to eliminate all objects outside of the picture region and the data areas to provide the cleaned-up digital image.

23. A method for producing a cleaned-up digital image of a document including essential data images and undesired noise images, comprising:

(a) digitally scanning the document to produce a first digital representation of the data images and the noise images;

(b) operating a processing system to perform a first object grabbing operation on the first digital representation to identify all object images thereof;

(c) operating the processing system to determine a skew angle of a straight line having a predetermined relationship to at least some objects representative of essential data and to de-skew the document by rotating the first digital representation by an amount equal to the magnitude of the skew angle to provide a de-skewed first digital representation;

(d) operating the processing system to perform a second object grabbing operation on the de-skewed first digital representation to create an object list of all object images of the de-skewed first digital representation;

(e) operating the processing system so as to identify a portion of the de-skewed first digital representation corresponding to a picture region of the document;

(f) operating the processing system to identify objects representing essential data images of the document and mark the identified objects as data objects; and

(g) constructing the cleaned-up digital image of the document by operating the processing system to

i. combine the objects in the picture region and the marked data objects to provide the cleaned-up digital image, and

ii. eliminate all objects not marked as data objects to provide a reconstructed digital representation of the essential images without the noise images.

24. A system for removing noise from a digital representation of data images and noise images produced by digital scanning of a document, comprising:

(a) processor means for performing an object grabbing operation on the digital representation to detect all objects of the document;

(b) processor means for identifying objects that represent essential information of the document and marking them as data objects; and

(c) processor means for reconstructing a digital representation of a reduced noise version of the document consisting of all of the marked data objects.

25. A system for removing noise from a digital representation of essential images and noise images produced by scanning of a document, comprising:

(a) processor means for performing an object grabbing operation on the digital representation to identify adjoining pixels which form objects;

(b) processor means for identifying a data area portion of the digital representation that includes the objects which constitute the essential images;

(c) processor means for performing a logical ANDing operation between the data area portion and the digital representation to provide a digital representation of the essential images without the noise images; and

26. A system for removing noise from a digital representation of essential images and noise images of a document, comprising:

(a) a scanning device for digitally scanning the document to produce the digital representation of the essential images and the noise images;

(b) a processing system including a program stored in the processing system for performing an object grabbing operation on the digital representation to identify adjoining pixels which form objects;

(c) a program stored in the processing system for initially marking all of the objects as noise images;

(d) a program stored in the processing system for identifying data objects of the digital representation that constitute essential images and marking the identified data objects as data objects; and

(e) a program stored in the processing system for eliminating all objects not marked as data objects from the digital representation to provide a reconstructed digital representation of the essential images without the noise images.

27. A system for removing noise from a digital representation of essential images and noise images of a document obtained by digitally scanning the document, comprising:

(a) a processing system including a program stored in the processing system for performing an object grabbing operation on the digital representation to identify adjoining pixels which form objects;

(b) a program stored in the processing system for identifying data objects of the digital representation that constitute essential images and marking the identified data objects as data objects; and

(c) a program stored in the processing system for eliminating all objects not marked as data objects from the digital representation to provide a reconstructed digital representation of the essential images without the noise images.